Smart Paper Notes

Predict-and-Critic: Accelerated End-to-End Predictive Control for Cloud Computing through Reinforcement Learning

Kaustubh Sridhar , Vikramank Singh , Balakrishnan Narayanaswamy , Abishek Sankararaman

Category: Machine Learning

2022-12-02

Cloud computing holds the promise of reduced costs through economies of scale. To realize this promise, cloud computing vendors typically solve sequential resource allocation problems, where customer workloads are packed on shared hardware. Virtual machines (VMs) form the foundation of modern cloud computing, as they logically abstract user compute from shared physical infrastructure. Traditionally, VM packing problems are solved by predicting demand and then running a Model Predictive Control (MPC) optimization over a future horizon. We introduce an approximate formulation of an industrial VM packing problem as a MILP with soft constraints parameterized by the predictions. Recently, predict-and-optimize (PnO) was proposed for end-to-end training of prediction models by back-propagating the cost of decisions through the optimization problem. However, PnO is unable to scale to the large prediction horizons prevalent in cloud computing. To tackle this issue, we propose the Predict-and-Critic (PnC) framework, which outperforms PnO with just a two-step horizon by leveraging reinforcement learning. PnC jointly trains a prediction model and a terminal Q function that approximates the cost-to-go over a long horizon, by back-propagating the cost of decisions through the optimization problem and from the future. The terminal Q function allows us to solve a much smaller two-step horizon optimization problem than the multi-step horizon necessary in PnO. We evaluate PnO and the PnC framework on two datasets, three workloads, and with disturbances not modeled in the optimization problem. We find that PnC significantly improves decision quality over PnO, even when the optimization problem is not a perfect representation of reality. We also find that hardening the soft constraints of the MILP and back-propagating through the constraints improves decision quality for both PnO and PnC.

Translated by Google Translate

Double Auctions with Two-sided Bandit Feedback

Soumya Basu , Abishek Sankararaman

Category: Machine Learning

2022-08-13

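Double auctions enable the decentralized transfer of goods between multiple buyers and sellers, and thus underpin the functioning of many online marketplaces. Buyers and sellers compete in these markets through bidding, but often do not know their own valuations a priori. Because allocation and pricing happen through bids, the profitability of participants, and hence the sustainability of such markets, depends crucially on each participant learning its valuation through repeated interactions. We initiate the study of double auction markets under bandit feedback on both the buyers' and the sellers' side. We show that with confidence-bound-based bidding and "Average Pricing", there is efficient price discovery among the participants. In particular, the buyers and sellers who exchange goods attain $O(\sqrt{T})$ regret in $T$ rounds. The buyers and sellers who do not benefit from exchange in turn experience only $O(\log{T}/\Delta)$ regret in $T$ rounds, where $\Delta$ is the minimum price gap. We augment our upper bound by showing that, with a fixed price for the good (a simpler learning problem than the double auction), $\omega(\sqrt{T})$ regret is unattainable in certain markets.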
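
The clearing step behind an average-price double auction can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's mechanism: the function name `average_price_clearing` is hypothetical, the price is taken as the midpoint of the last matched bid and ask, and the bids and asks are plain numbers rather than the confidence-bound-based bids the abstract describes.

```python
from typing import List, Optional, Tuple


def average_price_clearing(
    bids: List[float], asks: List[float]
) -> Tuple[Optional[float], List[int], List[int]]:
    """Match buyers and sellers in a single round (illustrative sketch).

    Buyers are ranked by descending bid and sellers by ascending ask.
    The k-th ranked buyer trades with the k-th ranked seller as long as
    the bid is at least the ask. The price is assumed to be the average
    of the last matched bid and ask.
    """
    buyers = sorted(range(len(bids)), key=lambda i: -bids[i])
    sellers = sorted(range(len(asks)), key=lambda j: asks[j])

    # Count how many ranked pairs can trade.
    k = 0
    while k < min(len(buyers), len(sellers)) and bids[buyers[k]] >= asks[sellers[k]]:
        k += 1

    if k == 0:
        return None, [], []  # no bid meets any ask: no trade this round

    price = (bids[buyers[k - 1]] + asks[sellers[k - 1]]) / 2
    return price, buyers[:k], sellers[:k]


if __name__ == "__main__":
    price, trading_buyers, trading_sellers = average_price_clearing(
        [10, 8, 5, 2], [3, 6, 7, 9]
    )
    print(price, trading_buyers, trading_sellers)
```

In a repeated setting, each participant would replace its fixed bid with an estimate of its own valuation that is updated from the feedback it observes after every round.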


Comparison and Evaluation of Methods for a Predict+Optimize Problem in Renewable Energy

Christoph Bergmeir , Frits de Nijs , Abishek Sriramulu , Mahdi Abolghasemi , Richard Bean , John Betts , Quang Bui , Nam Trong Dinh , Nils Einecke , Rasul Esmaeilbeigi

Category: Artificial Intelligence

2022-12-21

Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as supply chains (inventory optimization), traffic, and the transition towards carbon-free energy generation through battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the "IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim of fostering and facilitating research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that leads to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
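
The idea of sample average approximation mentioned above can be sketched on a toy problem: instead of optimizing against a single point forecast, pick the decision that minimizes the average cost over several forecast scenarios. Everything below is an assumption made for illustration (the function names `saa_choose` and `energy_cost`, the cost model, and the numbers); it is not the winning team's formulation, which the abstract does not spell out.

```python
from typing import Callable, Sequence


def saa_choose(
    candidates: Sequence[float],
    scenarios: Sequence[float],
    cost: Callable[[float, float], float],
) -> float:
    """Return the candidate decision with the lowest average cost over scenarios."""

    def average_cost(decision: float) -> float:
        return sum(cost(decision, s) for s in scenarios) / len(scenarios)

    return min(candidates, key=average_cost)


def energy_cost(reserved: float, demand: float) -> float:
    """Hypothetical cost: reserve energy ahead at 1 per unit,
    and cover any shortfall later at 2 per unit."""
    return 1.0 * reserved + 2.0 * max(0.0, demand - reserved)


if __name__ == "__main__":
    demand_scenarios = [2.0, 4.0, 6.0]        # assumed forecast scenarios
    candidate_reservations = list(range(7))   # reserve 0..6 units
    print(saa_choose(candidate_reservations, demand_scenarios, energy_cost))
```

In a real scheduling problem the candidate set is not enumerated; the scenario-averaged objective would be handed to a mixed integer solver instead.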


Unravelling the Performance of Physics-informed Graph Neural Networks for Dynamical Systems

Abishek Thangamuthu , Gunjan Kumar , Suresh Bishnoi , Ravinder Bhattoo , N M Anoop Krishnan , Sayan Ranu

Category: Machine Learning

2022-11-10

Recently, graph neural networks have attracted considerable attention for simulating dynamical systems, because their inductive nature leads to zero-shot generalizability. Similarly, physics-informed inductive biases in deep-learning frameworks have been shown to give superior performance in learning the dynamics of physical systems. There is a growing volume of literature that attempts to combine these two approaches. Here, we evaluate the performance of thirteen different graph neural networks, namely Hamiltonian and Lagrangian graph neural networks, graph neural ODEs, and their variants with explicit constraints and different architectures. We briefly explain the theoretical formulation, highlighting the similarities and differences in the inductive biases and graph architectures of these systems. We evaluate these models on spring, pendulum, gravitational, and 3D deformable solid systems to compare their performance in terms of rollout error, conserved quantities such as energy and momentum, and generalizability to unseen system sizes. Our study demonstrates that GNNs with additional inductive biases, such as explicit constraints and decoupling of kinetic and potential energies, exhibit significantly enhanced performance. Further, all the physics-informed GNNs exhibit zero-shot generalizability to system sizes an order of magnitude larger than the training system, thus providing a promising route to simulating large-scale realistic systems.


Improved Adaptive Algorithm for Scalable Active Learning with Weak Labeler

Yifang Chen , Karthik Sankararaman , Alessandro Lazaric , Matteo Pirotta , Dmytro Karamshuk , Qifan Wang , Karishma Mandyam , Sinong Wang , Han Fang

Category: Machine Learning | Artificial Intelligence

2022-11-04

Active learning with strong and weak labelers considers a practical setting where we have access to both costly but accurate strong labelers and inaccurate but cheap predictions provided by weak labelers. We study this problem in the streaming setting, where decisions must be taken online. We design a novel algorithmic template, Weak Labeler Active Cover (WL-AC), that is able to robustly leverage the lower-quality weak labelers to reduce the query complexity while retaining the desired level of accuracy. Prior active learning algorithms with access to weak labelers learn a difference classifier, which predicts where the weak labels differ from the strong labels; this requires the strong assumption that the difference classifier is realizable (Zhang and Chaudhuri, 2015). WL-AC bypasses this realizability assumption and is thus applicable to many real-world scenarios, such as randomly corrupted weak labels and high-dimensional families of difference classifiers (e.g., deep neural nets). Moreover, WL-AC trades off evaluating the quality of weak labelers against fully exploiting them, which allows any active learning strategy to be converted into one that can leverage weak labelers. We provide an instantiation of this template that achieves the optimal query complexity for any given weak labeler, without knowing its accuracy a priori. Empirically, we propose an instantiation of the WL-AC template that can be efficiently implemented for large-scale models (e.g., deep neural nets) and show its effectiveness on the corrupted-MNIST dataset by significantly reducing the number of labels while keeping the same accuracy as in passive learning.


Bandits with Knapsacks beyond the Worst-Case

Karthik Abinav Sankararaman , Aleksandrs Slivkins

Category: Machine Learning | Machine Learning (Statistics)

2020-02-01

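Bandits with Knapsacks (BwK) is a general model for multi-armed bandits under supply/budget constraints. While worst-case regret bounds for BwK are well understood, we present three results that go beyond the worst-case perspective. First, we provide upper and lower bounds that amount to a full characterization of logarithmic, instance-dependent regret rates. Second, we consider "simple regret" in BwK, which tracks the algorithm's performance in a given round, and prove that it is small in all but a few rounds. Third, we provide a general "reduction" from BwK to bandits that takes advantage of some known helpful structure, and apply this reduction to combinatorial semi-bandits, linear contextual bandits, and multinomial-logit bandits. Our results build on the BwK algorithm from \citet{AgrawalDevanur-EC14}, providing new analyses thereof.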


Adversarial Bandits with Knapsacks

Nicole Immorlica , Karthik Abinav Sankararaman , Robert Schapire , Aleksandrs Slivkins

Category: Machine Learning | Machine Learning (Statistics)

2018-11-28

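We consider Bandits with Knapsacks (henceforth, BwK), a general model for multi-armed bandits under supply/budget constraints. In particular, a bandit algorithm needs to solve a well-known knapsack problem: find an optimal packing of items into a limited-size knapsack. The BwK problem is a common generalization of numerous motivating examples, which range from dynamic pricing to repeated auctions to dynamic ad allocation to network routing and scheduling. While prior work on BwK focused on the stochastic version, we pioneer the other extreme, in which the outcomes can be chosen adversarially. This is a considerably harder problem than both the stochastic version and the "classic" adversarial bandits, in that regret minimization is no longer feasible. Instead, the objective is to minimize the competitive ratio: the ratio of the benchmark reward to the algorithm's reward. We design an algorithm with competitive ratio O(log T) relative to the best fixed distribution over actions, where T is the time horizon; we also prove a matching lower bound. The key conceptual contribution is a new perspective on the stochastic version of the problem. We propose a new algorithm for the stochastic version, which builds on the framework of regret minimization in repeated games and admits a simpler analysis than prior work. We then analyze this algorithm for the adversarial version and use it as a subroutine to solve the latter.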
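
The interaction protocol and the competitive-ratio objective described above can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's algorithm: the names `run_bwk` and `competitive_ratio` are hypothetical, there is a single resource, each arm has a fixed (reward, cost) pair, and the run stops at the first pull whose cost no longer fits the remaining budget. The benchmark used in the example is simply another fixed arm, whereas the paper compares against the best fixed distribution over actions.

```python
from typing import Callable, List, Tuple

Outcome = Tuple[int, float, float]  # (arm, reward, cost)


def run_bwk(
    policy: Callable[[int, List[Outcome]], int],
    arms: List[Tuple[float, float]],
    budget: float,
    horizon: int,
) -> Tuple[float, int]:
    """Play until the horizon ends or the budget cannot cover the next pull.

    arms[a] is an assumed deterministic (reward, cost) pair for arm a.
    Returns the total reward collected and the number of completed rounds.
    """
    total_reward = 0.0
    remaining = budget
    history: List[Outcome] = []
    for t in range(horizon):
        arm = policy(t, history)
        reward, cost = arms[arm]
        if cost > remaining:
            break  # budget exhausted: the interaction stops here
        remaining -= cost
        total_reward += reward
        history.append((arm, reward, cost))
    return total_reward, len(history)


def competitive_ratio(benchmark_reward: float, algorithm_reward: float) -> float:
    """Ratio of the benchmark's reward to the algorithm's reward (>= 1 is worse)."""
    return benchmark_reward / algorithm_reward


if __name__ == "__main__":
    arms = [(1.0, 2.0), (0.5, 0.5)]  # arm 0: high reward, expensive; arm 1: cheap
    algo_reward, algo_rounds = run_bwk(lambda t, h: 0, arms, budget=5.0, horizon=10)
    bench_reward, bench_rounds = run_bwk(lambda t, h: 1, arms, budget=5.0, horizon=10)
    print(algo_reward, algo_rounds, bench_reward, bench_rounds)
    print(competitive_ratio(bench_reward, algo_reward))
```

The example shows why per-round reward alone is misleading under a budget: the arm with the higher reward per pull exhausts the budget after two rounds and ends with less total reward than the cheaper arm.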
